# Low-resource Inference

Magtie V1 12B GGUF

A collection of GGUF quantized versions of MagTie-v1-12B, a model created by merging pre-trained language models with mergekit. Suitable for text generation tasks.

Large Language Model

Qwen Qwen3 8B GGUF

GGUF-format quantized version of Qwen3-8B, provided by TensorBlock and compatible with llama.cpp.

Large Language Model

Deepseek R1 GGUF UD

A GGUF build of the DeepSeek-R1 large language model, quantized with Unsloth Dynamic v2.0, a quantization method intended to preserve accuracy at low bit widths.

Large Language Model English

Orpheus 3b Kaya Q2 K.gguf

A text-to-speech model fine-tuned from Canopy Labs' pre-trained model, supporting English, using GGUF Q2_K quantization format for efficient inference

Speech Synthesis Supports Multiple Languages

Meta Llama Llama 4 Scout 17B 16E Instruct Old GGUF

Llama-4-Scout-17B-16E-Instruct is an instruction-tuned Mixture of Experts large language model released by Meta, with 17B active parameters and 16 experts. This GGUF build is quantized to reduce memory use and improve inference efficiency.

Large Language Model Supports Multiple Languages

Gemma 3 4b It Abliterated Q4 0 GGUF

This model is a GGUF format conversion of mlabonne/gemma-3-4b-it-abliterated, combined with the visual component of x-ray_alpha for a smoother multimodal experience.

Gemma 3 4b It Q4 0

Gemma 3 4B Instruct is a 4-billion-parameter large language model developed by Google, focusing on text generation and comprehension tasks.

Large Language Model
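As a rough illustration of why quantized builds such as Q4_0 suit low-resource inference, the sketch below estimates the size of the weights alone at different precisions. It assumes the commonly documented block layouts (Q4_0: 32 weights in 18 bytes, Q8_0: 32 weights in 34 bytes) and uses the nominal 4-billion parameter count from the description above. The function name `estimate_weight_gib` is hypothetical, and real GGUF files will differ because some tensors are stored at other precisions and runtime memory also includes the KV cache.

```python
# Rough size estimate for quantized weights (a sketch, not llama.cpp's own accounting).
# Assumed block layouts: Q4_0 packs 32 weights into 18 bytes (16 bytes of 4-bit
# values plus a 2-byte fp16 scale); Q8_0 packs 32 weights into 34 bytes
# (32 bytes of 8-bit values plus a 2-byte fp16 scale).
BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": 34 * 8 / 32,  # 8.5 bits per weight
    "Q4_0": 18 * 8 / 32,  # 4.5 bits per weight
}


def estimate_weight_gib(n_params: float, quant: str) -> float:
    """Return the approximate size of the weights alone, in GiB."""
    total_bytes = n_params * BITS_PER_WEIGHT[quant] / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # 4e9 is the nominal parameter count, used here only for illustration.
    for quant in BITS_PER_WEIGHT:
        print(f"{quant}: {estimate_weight_gib(4e9, quant):.2f} GiB")
```

Under these assumptions a Q4_0 file is a little over a quarter of the size of the same weights in FP16, which is the main reason such builds fit on consumer hardware.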

Doge 120M MoE Instruct

The Doge model employs dynamic masked attention mechanisms for sequence transformation and can use multi-layer perceptrons or cross-domain mixture of experts for state transitions.

Large Language Model

Transformers English

Bge Reranker Base Q4 K M GGUF

GGUF-format re-ranking model converted from BAAI/bge-reranker-base, supporting Chinese and English text re-ranking tasks.

Text Embedding Supports Multiple Languages

Thedrummer Fallen Gemma3 4B V1 GGUF

This is a quantized version of TheDrummer/Fallen-Gemma3-4B-v1 model, processed using llama.cpp, suitable for text generation tasks.

Large Language Model

Gemmax2 28 9B V0.1 Q2 K GGUF

GemmaX2-28-9B-v0.1-Q2_K-GGUF is a GGUF format model converted from ModelSpace/GemmaX2-28-9B-v0.1, supporting multilingual translation tasks.

Large Language Model Supports Multiple Languages

Qwen2.5 Bakeneko 32b Instruct V2 Gguf

This is a quantized version of rinna/qwen2.5-bakeneko-32b-instruct-v2 using llama.cpp, compatible with various llama.cpp-based applications.

Large Language Model Japanese

Gemma 3 4b It Q4 K M GGUF

Gemma 3 4B IT is an open-source large language model developed by Google. This version is the 4-bit quantized version converted to GGUF format via llama.cpp.

Large Language Model

Google.gemma 3 4b It GGUF

Gemma 3 4B IT is the instruction-tuned version of Google's 4-billion-parameter Gemma 3 large language model, suitable for various natural language processing tasks.

Large Language Model

Text Summarization Q8 0 GGUF

This model is a GGUF-format text summarization model converted from Falconsai/text_summarization, designed for efficient inference via llama.cpp.

Text Generation English

Nousresearch DeepHermes 3 Llama 3 8B Preview GGUF

A dialogue model fine-tuned based on Llama-3-8B, supporting multiple quantization versions, suitable for tasks such as chatting, reasoning, and role-playing.

Large Language Model English

A fusion model specifically designed for role-playing and creative writing, combining Rei-12B and Francois-Huali-12B via Slerp algorithm

Large Language Model
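For readers unfamiliar with Slerp (spherical linear interpolation), the following is a minimal textbook sketch of how two flattened weight vectors can be blended along the arc between them rather than along a straight line. It is an illustration under stated assumptions, not the merge tool's actual implementation: the function name `slerp`, the fallback threshold `eps`, and the use of plain Python lists are all choices made here for clarity.

```python
import math
from typing import List, Sequence


def slerp(a: Sequence[float], b: Sequence[float], t: float, eps: float = 1e-8) -> List[float]:
    """Spherical linear interpolation between two equally sized vectors.

    t=0 returns a, t=1 returns b. Falls back to ordinary linear
    interpolation when the vectors are (nearly) collinear or one is zero.
    """
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return [(1 - t) * x + t * y for x, y in zip(a, b)]

    # Cosine of the angle between the vectors, clamped to acos's domain.
    cos_theta = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)

    if sin_theta < eps:
        # Angle too small (or too close to pi) for a stable division.
        return [(1 - t) * x + t * y for x, y in zip(a, b)]

    weight_a = math.sin((1 - t) * theta) / sin_theta
    weight_b = math.sin(t * theta) / sin_theta
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]
```

In a model merge this kind of interpolation would typically be applied tensor by tensor, with `t` controlling how much each parent model contributes.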

Llama 3.1 0x Mini Q8 0 GGUF

This is a GGUF format model converted from ozone-ai/llama-3.1-0x-mini, suitable for the llama.cpp framework.

Large Language Model

Senecallm X Qwen2.5 7B CyberSecurity Q8 0 GGUF

This is a large language model for the cybersecurity domain based on the Qwen2.5-7B architecture, converted to GGUF format for use with llama.cpp.

Large Language Model English

Mistral Portuguese Luana 7b Mental Health Q5 K M GGUF PTBR

This is a Portuguese-based Mistral model, specifically fine-tuned for the mental health domain, suitable for Portuguese text generation tasks.

Large Language Model Other

Suzume Llama 3 8B Multilingual

Suzume 8B is a multilingual fine-tuned version based on Llama 3, trained on nearly 90,000 multilingual dialogues to enhance multilingual communication capabilities while maintaining Llama 3's intelligence level.

Large Language Model

Saiga Llama3 8b

A Russian chat assistant based on Llama-3 8B Instruct, specially trained to provide Russian dialogue support.

Large Language Model

Transformers Other

Percival 01 7b Slerp

Percival_01-7b-slerp is a 7B-parameter large language model, reported to rank second on the Open LLM Leaderboard, obtained by merging the liminerity/M7-7b and Gille/StrangeMerges_32-7B-slerp models using the LazyMergekit tool.

Large Language Model

Saul Instruct V1 GGUF

Saul-Instruct-v1-GGUF is the GGUF format version of the Equall/Saul-Instruct-v1 model, suitable for text generation tasks and supports multiple quantization levels.

Large Language Model English

Tinymixtral 4x248M MoE

TinyMixtral-4x248M-MoE is a small language model adopting the Mixture of Experts (MoE) architecture, formed by merging multiple TinyMistral variants, suitable for text generation tasks.

Large Language Model

Tinymistral 6x248M Instruct

An instruction-tuned language model with a Mixture of Experts (MoE) architecture, built by combining multiple models with LazyMergekit and described as performing well on instruction-following tasks.

Large Language Model

Transformers English

Tinymistral 6x248M

TinyMistral-6x248M is a Mixture of Experts system that integrates 6 TinyMistral variants using the LazyMergekit tool, pre-trained on the nampdn-ai/mini-peS2o dataset

Large Language Model

Laser Dolphin Mixtral 2x7b Dpo

A medium-scale Mixture of Experts (MoE) implementation based on Dolphin-2.6-Mistral-7B-DPO-Laser, with an average performance improvement of approximately 1 point in evaluations

Large Language Model

Beyonder 4x7B V2

Beyonder-4x7B-v2 is a large language model based on the Mixture of Experts (MoE) architecture, consisting of 4 expert modules, each specializing in different domains such as dialogue, programming, creative writing, and mathematical reasoning.

Large Language Model
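To make the Mixture of Experts idea concrete, here is a minimal sketch of top-k routing: a router scores every expert for a given input, only the k highest-scoring experts run, and their outputs are blended using softmax weights over the selected scores. Everything here is hypothetical and for illustration only, including the function names, the choice of k=2, and the toy experts; it is not taken from any of the listed models.

```python
import math
from typing import Callable, List, Sequence, Tuple

Expert = Callable[[Sequence[float]], List[float]]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(
    x: Sequence[float],
    router_logits: Sequence[float],
    experts: Sequence[Expert],
    k: int = 2,
) -> Tuple[List[float], List[int]]:
    """Run only the k best-scoring experts and mix their outputs.

    Returns the mixed output and the indices of the experts that ran.
    """
    # Indices of the k experts with the highest router scores.
    ranked = sorted(range(len(experts)), key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    # Renormalize the scores of the chosen experts so their weights sum to 1.
    weights = softmax([router_logits[i] for i in chosen])

    output = [0.0] * len(x)
    for weight, index in zip(weights, chosen):
        expert_out = experts[index](x)
        output = [acc + weight * value for acc, value in zip(output, expert_out)]
    return output, chosen


def make_scaling_expert(scale: float) -> Expert:
    """Toy expert that multiplies its input by a constant."""
    return lambda x: [scale * value for value in x]
```

Because only k experts run per input, the compute per token stays close to that of a much smaller dense model even though the total parameter count is larger.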

Mamba-1B is a 1B-parameter language model based on the Mamba architecture, supporting English text generation tasks.

Large Language Model

Transformers English

Swallow 70B Instruct GGUF

Swallow 70B Instruct is a large language model provided here as GGUF model files, which are supported by multiple clients and libraries for text generation.

Large Language Model

Transformers Supports Multiple Languages

Dolphin 2.5 Mixtral 8x7b GPTQ

Dolphin 2.5 Mixtral 8X7B is a large language model developed by Eric Hartford based on the Mixtral architecture, fine-tuned on multiple high-quality datasets, suitable for various natural language processing tasks.

Large Language Model

Transformers English

Causallm 7B GGUF

CausalLM 7B is a multilingual large language model based on the Llama 2 architecture, supporting English and Chinese text generation tasks.

Large Language Model Supports Multiple Languages

Jellyfish-13B is a 13-billion-parameter large language model specifically customized for data preprocessing tasks, including error detection, data imputation, pattern matching, and entity matching.

Large Language Model

Transformers English

Whisper Large Onnx Int4 Inc

Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. This repository provides the Whisper large model in ONNX format with INT4 weight quantization, powered by Intel® Neural Compressor and Intel® Extension for Transformers.

Speech Recognition

Mythomax L2 13B AWQ

The AWQ-quantized version of MythoMax L2 13B, intended to reduce memory use and improve inference efficiency compared with the unquantized model.

Large Language Model

Transformers English

Mythalion 13B GGUF

Mythalion 13B is a 13B-parameter large language model developed by PygmalionAI, based on the Llama architecture, specializing in text generation and instruction-following tasks.

Large Language Model English

Llama 2 13B GGUF

Llama 2 is an open-source large language model series developed by Meta. The 13B version is a medium-scale model suitable for various text generation tasks.

Large Language Model English

Codellama Chat 13b Chinese

CodeLlama is a model specifically designed for code assistance, excelling in handling programming-related Q&A and supporting multi-turn dialogues in Chinese and English.

Large Language Model

Transformers Supports Multiple Languages

Pygmalion 6b 4bit 128g

A 4-bit GPTQ quantized model based on Pygmalion-6B, suitable for dialogue generation tasks, supporting English text generation

Large Language Model

Transformers English

© 2025 AIbase